Spatial Point Patterns Analysis
pacman::p_load(sf, raster, spatstat, tmap, tidyverse, rvest, geojsonsf)
1.0 Context:
Spatial Point Pattern Analysis involves evaluating the pattern or distribution of a set of points on a surface. These points can represent the locations of:
Events, such as crimes, traffic accidents, or disease outbreaks, or
Business services, like coffee shops and fast food outlets, or facilities such as childcare and eldercare centers.
The specific questions we would like to answer are as follows:
Are the childcare centres in Singapore randomly distributed throughout the country?
If the answer is no, where are the locations with a higher concentration of childcare centres?
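The first question is usually framed as a test against complete spatial randomness (CSR), and the second as a density surface. The sketch below illustrates both ideas on a synthetic pattern in a unit square, using spatstat. The window, the number of points and the 4 x 4 quadrat grid are assumptions chosen for illustration only, not values from this exercise.

```r
library(spatstat)

set.seed(42)

# Hypothetical study window: a unit square (not the Singapore boundary)
win <- owin(xrange = c(0, 1), yrange = c(0, 1))

# Synthetic point pattern: 200 points placed uniformly at random
pts <- runifpoint(200, win = win)

# Question 1: chi-squared quadrat count test of CSR on a 4 x 4 grid
qt <- quadrat.test(pts, nx = 4, ny = 4)
p_value <- qt$p.value

# Question 2: kernel density estimate, with the bandwidth chosen by
# cross-validation (bw.diggle); the result is a pixel image
kde <- density(pts, sigma = bw.diggle(pts))
```

A small p-value would suggest a departure from CSR, and plot(kde) would show where points concentrate. With real data, the window would be the study area boundary rather than a square.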
2.0 Downloading the Data sets
To provide answers to the questions above, three data sets will be used. They are:
CHILDCARE, a point feature data set providing both location and attribute information of childcare centres. It was downloaded from Data.gov.sg and is in GeoJSON format.
MP14_SUBZONE_WEB_PL, a polygon feature data set providing information on the URA 2014 Master Plan Planning Subzone boundaries. It is in ESRI shapefile format. This data set was also downloaded from Data.gov.sg.
CostalOutline, a polygon feature data set showing the national boundary of Singapore. It is provided by SLA and is in ESRI shapefile format.
3.0 Installing and loading R packages
In this hands-on exercise, five R packages will be used:
sf, a relatively new R package specially designed to import, manage and process vector-based geospatial data in R.
spatstat, which has a wide range of useful functions for point pattern analysis. In this hands-on exercise, it will be used to perform 1st- and 2nd-order spatial point pattern analysis and to derive a kernel density estimation (KDE) layer.
raster, which reads, writes, manipulates, analyses and models gridded spatial data (i.e. raster). In this hands-on exercise, it will be used to convert the image output generated by spatstat into raster format.
maptools, which provides a set of tools for manipulating geographic data. In this hands-on exercise, we mainly use it to convert Spatial objects into the ppp format of spatstat.
tmap, which provides functions for plotting cartographic-quality static point pattern maps or interactive maps by using the leaflet API.
The packages are installed and loaded with the pacman::p_load() call at the top of this page. That call also loads tidyverse, rvest and geojsonsf, and it does not load maptools.
4.0 Spatial Data Wrangling
4.1 Importing Spatial Data
childcare_sf <- st_read("data/child-care-services-geojson.geojson") %>%
  st_transform(crs = 3414)
Reading layer `child-care-services-geojson' from data source
`C:\Users\jiale\Desktop\IS415\IS415-GAA\Hands_On_Exercises\Hands_On_Exercise_4\data\child-care-services-geojson.geojson'
using driver `GeoJSON'
Simple feature collection with 1545 features and 2 fields
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
z_range: zmin: 0 zmax: 0
Geodetic CRS: WGS 84
sg_sf <- st_read(dsn = "data", layer = "CostalOutline")
Reading layer `CostalOutline' from data source
`C:\Users\jiale\Desktop\IS415\IS415-GAA\Hands_On_Exercises\Hands_On_Exercise_4\data'
using driver `ESRI Shapefile'
Simple feature collection with 60 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 2663.926 ymin: 16357.98 xmax: 56047.79 ymax: 50244.03
Projected CRS: SVY21
mpsz_sf <- st_read(dsn = "data", layer = "MP14_SUBZONE_WEB_PL")
Reading layer `MP14_SUBZONE_WEB_PL' from data source
`C:\Users\jiale\Desktop\IS415\IS415-GAA\Hands_On_Exercises\Hands_On_Exercise_4\data'
using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
4.1.1 DIY: Use the appropriate sf function to retrieve the referencing system information of these geospatial data.
The st_crs() function from sf retrieves the CRS of each layer, which we then print.
# Retrieve CRS information
childcare_crs <- st_crs(childcare_sf)
sg_crs <- st_crs(sg_sf)
mpsz_crs <- st_crs(mpsz_sf)
# Print CRS information
print(childcare_crs)
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
print(sg_crs)
Coordinate Reference System:
User input: SVY21
wkt:
PROJCRS["SVY21",
BASEGEOGCRS["SVY21[WGS84]",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ID["EPSG",6326]],
PRIMEM["Greenwich",0,
ANGLEUNIT["Degree",0.0174532925199433]]],
CONVERSION["unnamed",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
print(mpsz_crs)
Coordinate Reference System:
User input: SVY21
wkt:
PROJCRS["SVY21",
BASEGEOGCRS["SVY21[WGS84]",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ID["EPSG",6326]],
PRIMEM["Greenwich",0,
ANGLEUNIT["Degree",0.0174532925199433]]],
CONVERSION["unnamed",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
4.1.2 DIY: Assign the correct CRS to the mpsz_sf and sg_sf simple feature data frames.
Notice that the CRS of mpsz_sf and sg_sf is labelled SVY21 but carries no EPSG code, and its datum is recorded as World Geodetic System 1984. We need both layers in EPSG:3414, and we can get there with st_transform().
mpsz_sf <- st_read(dsn = "data", layer = "MP14_SUBZONE_WEB_PL") %>%
  st_transform(crs = 3414)
Reading layer `MP14_SUBZONE_WEB_PL' from data source
`C:\Users\jiale\Desktop\IS415\IS415-GAA\Hands_On_Exercises\Hands_On_Exercise_4\data'
using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
sg_sf <- st_read(dsn = "data", layer = "CostalOutline") %>%
  st_transform(crs = 3414)
Reading layer `CostalOutline' from data source
`C:\Users\jiale\Desktop\IS415\IS415-GAA\Hands_On_Exercises\Hands_On_Exercise_4\data'
using driver `ESRI Shapefile'
Simple feature collection with 60 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 2663.926 ymin: 16357.98 xmax: 56047.79 ymax: 50244.03
Projected CRS: SVY21
print(st_crs(mpsz_sf))
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
print(st_crs(sg_sf))
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
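Assigning a CRS and transforming to a CRS are different operations in sf: st_set_crs() only changes the label attached to the coordinates, while st_transform() recomputes the coordinates. The sketch below shows the difference on a single synthetic point placed at the natural origin of the Singapore TM projection, with the latitude and longitude copied from the WKT printed above. The object names are hypothetical.

```r
library(sf)

# Natural origin of Singapore TM (longitude, latitude), in WGS 84
origin_wgs84 <- st_sfc(
  st_point(c(103.833333333333, 1.36666666666667)),
  crs = 4326
)

# st_transform() reprojects: coordinates are recomputed in metres
origin_svy21 <- st_transform(origin_wgs84, crs = 3414)
xy_transformed <- st_coordinates(origin_svy21)

# st_set_crs() only relabels: the numbers stay in degrees, which is wrong here.
# sf warns that replacing a CRS does not reproject; the warning is silenced
# only to keep this illustration quiet.
origin_relabelled <- suppressWarnings(st_set_crs(origin_wgs84, 3414))
xy_relabelled <- st_coordinates(origin_relabelled)

print(xy_transformed)
print(xy_relabelled)
```

The transformed point should land close to the false easting and false northing listed in the WKT, while the relabelled point keeps its longitude and latitude values. st_set_crs() is therefore appropriate only when the coordinates are already in the target system and the label alone is missing or wrong.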
4.1.3 Change the referencing System to Singapore National Projected Coordinate System
Understanding the CRS in our data:
MPSZ and coastal outline data:
CRS: SVY21, the Singapore national projected coordinate system, which is based on the WGS 84 ellipsoid.
Description: This is the projected coordinate system commonly used for accurate mapping in Singapore.
Childcare data:
CRS: SVY21 / Singapore TM (Transverse Mercator projection), after the st_transform() call at import. The raw GeoJSON file is in WGS 84 geographic coordinates.
Description: This is also a projection based on SVY21, specifically using the Transverse Mercator projection. It is very close to the SVY21 system reported for the shapefiles, with minor differences in how the projection is described.
Given that the map file serves as the base and we want all spatial data to overlay correctly, we should:
Transform the GeoJSON data to match the map file's CRS:
- Since our MPSZ and coastal outline data are already in SVY21 (EPSG:3414), transform the GeoJSON data to EPSG:3414 as well.
Rationale:
This approach ensures that the childcare locations from the GeoJSON data will be accurately plotted within the boundaries and context provided by the map file (MPSZ and coastal outline data).
It avoids potential misalignment, especially since our base map data is already set up in a local projection suitable for Singapore.
# Transform Childcare data to match the base map's CRS (EPSG:3414)
childcare_sf <- st_read("data/child-care-services-geojson.geojson") %>%
  st_transform(crs = 3414)
Reading layer `child-care-services-geojson' from data source
`C:\Users\jiale\Desktop\IS415\IS415-GAA\Hands_On_Exercises\Hands_On_Exercise_4\data\child-care-services-geojson.geojson'
using driver `GeoJSON'
Simple feature collection with 1545 features and 2 fields
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
z_range: zmin: 0 zmax: 0
Geodetic CRS: WGS 84
# Now, all datasets should be aligned in the same CRS
4.1.4 Checking for validity of maps
When working with spatial data, it’s crucial to ensure that all geometries are valid. Invalid geometries can cause errors in analysis and visualization.
- Checking validity with st_is_valid()
- Identifying invalid geometries
- Fixing invalid geometries with st_make_valid()
mpsz_validity <- st_is_valid(mpsz_sf)
mpsz_invalid <- which(!mpsz_validity)
if (length(mpsz_invalid) > 0) {
  print("MPZ Invalid!")
  print(mpsz_sf[mpsz_invalid, ])
} else {
  print("it's valid!")
}
[1] "MPZ Invalid!"
Simple feature collection with 9 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 12535.88 ymin: 21678.35 xmax: 56396.44 ymax: 49291.03
Projected CRS: SVY21 / Singapore TM
OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C CA_IND
19 19 2 SOUTHERN GROUP SISZ02 N
20 20 1 SENTOSA SISZ01 N
24 24 1 MARITIME SQUARE BMSZ01 N
122 122 9 JURONG PORT JESZ09 N
123 123 3 SAMULUN BLSZ03 N
128 128 9 PANDAN CLSZ09 N
258 258 4 PASIR RIS PARK PRSZ04 N
302 302 1 NORTH-EASTERN ISLANDS NESZ01 N
320 320 9 NORTH COAST WDSZ09 N
PLN_AREA_N PLN_AREA_C REGION_N REGION_C
19 SOUTHERN ISLANDS SI CENTRAL REGION CR
20 SOUTHERN ISLANDS SI CENTRAL REGION CR
24 BUKIT MERAH BM CENTRAL REGION CR
122 JURONG EAST JE WEST REGION WR
123 BOON LAY BL WEST REGION WR
128 CLEMENTI CL WEST REGION WR
258 PASIR RIS PR EAST REGION ER
302 NORTH-EASTERN ISLANDS NE NORTH-EAST REGION NER
320 WOODLANDS WD NORTH REGION NR
INC_CRC FMEL_UPD_D X_ADDR Y_ADDR SHAPE_Leng SHAPE_Area
19 5809FC547293EA2D 2014-12-05 29815.09 23412.59 25626.977 2206319
20 A6FCDC9C447952CB 2014-12-05 27593.94 25813.35 17496.194 4919132
24 C1AC31ABF9978DDB 2014-12-05 25805.79 27911.42 13737.116 2701634
122 0664CA7EF6504AE5 2014-12-05 15250.74 32183.92 11355.002 2464857
123 F78E0287D3F24214 2014-12-05 13418.49 32264.59 8738.679 1940693
128 A6EE4A49376B69C4 2014-12-05 19228.60 32265.40 5689.647 1312923
258 9856E3CDCF57AD96 2014-12-05 41529.80 40218.94 8533.964 1719705
302 92BC3E09C68F3B52 2014-12-05 50424.79 42612.88 62436.235 67250563
320 898B2436858382A1 2014-12-05 22147.04 48031.55 10847.882 2450784
geometry
19 MULTIPOLYGON (((29712.51 23...
20 MULTIPOLYGON (((26858.1 266...
24 MULTIPOLYGON (((26514.58 28...
122 MULTIPOLYGON (((14483.48 31...
123 MULTIPOLYGON (((12861.38 32...
128 MULTIPOLYGON (((19680.06 31...
258 MULTIPOLYGON (((41343.11 40...
302 MULTIPOLYGON (((52567.43 46...
320 MULTIPOLYGON (((21693.06 48...
Notice that mpsz_sf has 9 subzones with invalid geometries, so we repair them with st_make_valid() and then run the check again.
mpsz_sf <- st_make_valid(mpsz_sf)
mpsz_validity <- st_is_valid(mpsz_sf)
mpsz_invalid <- which(!mpsz_validity)
if (length(mpsz_invalid) > 0) {
  print("MPZ Invalid!")
  print(mpsz_sf[mpsz_invalid, ])
} else {
  print("it's valid!")
}
[1] "it's valid!"
sg_validity <- st_is_valid(sg_sf)
sg_invalid <- which(!sg_validity)
if (length(sg_invalid) > 0) {
  print("SG Invalid!")
  # Print the invalid coastal outline features (index sg_sf, not mpsz_sf)
  print(sg_sf[sg_invalid, ])
} else {
  print("it's valid!")
}
[1] "SG Invalid!"
sg_sf also contains one invalid geometry, so we apply the same fix.
sg_sf <- st_make_valid(sg_sf)
sg_validity <- st_is_valid(sg_sf)
sg_invalid <- which(!sg_validity)
if (length(sg_invalid) > 0) {
  print("SG Invalid!")
  # Index sg_sf here so that the rows printed belong to the coastal outline
  print(sg_sf[sg_invalid, ])
} else {
  print("it's valid!")
}
[1] "it's valid!"
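The check-then-repair pattern used above can be seen on a small synthetic case. The sketch below builds a self-intersecting "bowtie" polygon next to a valid square, flags the invalid one with st_is_valid(), and repairs it with st_make_valid(). The shapes and object names are made up for illustration and are not part of the exercise data.

```r
library(sf)

# A "bowtie" ring crosses itself at (0.5, 0.5), so the polygon is invalid
bowtie <- st_polygon(list(rbind(
  c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0)
)))

# A plain square, which is valid
square <- st_polygon(list(rbind(
  c(2, 0), c(3, 0), c(3, 1), c(2, 1), c(2, 0)
)))

shapes <- st_sf(id = 1:2, geometry = st_sfc(bowtie, square, crs = 3414))

# Logical vector: one entry per feature
valid <- st_is_valid(shapes)
invalid_rows <- which(!valid)

# reason = TRUE returns a text explanation instead of TRUE/FALSE
reasons <- st_is_valid(shapes, reason = TRUE)

# Repair, then confirm that every feature is now valid
fixed <- st_make_valid(shapes)
all_valid <- all(st_is_valid(fixed))
```

Printing reasons before repairing can help decide whether an automatic fix is acceptable, since st_make_valid() may change the geometry type of a repaired feature (for example, from POLYGON to MULTIPOLYGON).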
Notice that the childcare GeoJSON stores its attributes as an HTML table inside the Description column, so we need to break this column up to get more meaningful data.
We can extract the fields from the Description attribute and map them to proper columns. Assuming that each table row (TR) contains one table header (TH) and one table data cell (TD), we can pair each header with its value. First, we confirm that the childcare geometries are valid.
childcare_validity <- st_is_valid(childcare_sf)
childcare_invalid <- which(!childcare_validity)
if (length(childcare_invalid) > 0) {
  print("ChildCare Invalid!")
  print(childcare_sf[childcare_invalid, ])
} else {
  print("it's valid!")
}
[1] "it's valid!"
# Ensure the geometry column is preserved
geometry_column <- st_geometry(childcare_sf)
# Parse one Description value (an HTML table) into a named character vector
parse_description <- function(html_string) {
  html <- read_html(html_string)
  # Keep the table rows, dropping the "Attributes" title row
  html <- html %>% html_nodes("tr") %>% .[!grepl("Attributes", .)]
  headers <- html %>% html_nodes("th") %>% html_text(trim = TRUE)
  values <- html %>% html_nodes("td") %>% html_text(trim = TRUE)
  # Handle cases where the number of headers and values don't match
  if (length(headers) != length(values)) {
    max_length <- max(length(headers), length(values))
    headers <- c(headers, rep("ExtraHeader", max_length - length(headers)))
    values <- c(values, rep("NULL", max_length - length(values)))
  }
  setNames(values, headers)
}
# Apply parsing function, unnest the description fields, and remove the original 'Description' column
childcare_sf <- childcare_sf %>%
  mutate(Description_parsed = map(Description, parse_description)) %>%
  unnest_wider(Description_parsed) %>%
  select(-Description) # Remove the original 'Description' column
# Overwrite the 'Name' column with the parsed 'NAME' field
childcare_sf <- childcare_sf %>%
  mutate(Name = NAME)
# Replace empty strings or NA across all non-geometry columns with "NULL"
childcare_sf <- childcare_sf %>%
  mutate(across(!geometry, ~ ifelse(is.na(.) | . == "", "NULL", .)))
# Reassign the geometry to the dataframe
st_geometry(childcare_sf) <- geometry_column
# Ensure it's still an sf object
class(childcare_sf)
[1] "sf" "tbl_df" "tbl" "data.frame"
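To make the TR/TH/TD assumption behind parse_description() concrete, the sketch below runs the same extraction steps on a single hand-written HTML string. The field names and values in the string are hypothetical stand-ins, not taken from the actual childcare file.

```r
library(rvest)

# Hypothetical Description value: one title row, then one TH/TD pair per row
html_string <- paste0(
  "<table>",
  "<tr><th colspan='2'>Attributes</th></tr>",
  "<tr><th>NAME</th><td>EXAMPLE CHILDCARE</td></tr>",
  "<tr><th>ADDRESSPOSTALCODE</th><td>123456</td></tr>",
  "</table>"
)

doc <- read_html(html_string)

# Collect the rows and drop the "Attributes" title row
rows <- html_nodes(doc, "tr")
rows <- rows[!grepl("Attributes", html_text(rows))]

# Header cells become names, data cells become values
headers <- html_text(html_nodes(rows, "th"), trim = TRUE)
values <- html_text(html_nodes(rows, "td"), trim = TRUE)
record <- setNames(values, headers)

print(record)
```

Each element of record is one attribute of one childcare centre, which is the shape that unnest_wider() then spreads into columns. If a row had a header without a data cell, headers and values would differ in length, which is the case the padding branch in parse_description() guards against.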
4.2 Mapping the geospatial datasets.
Using the mapping methods learned in Hands-on Exercise 3, prepare a static map that overlays the childcare centres on the subzone boundaries.
# Suppress the tmap mode message
suppressMessages({
  tmap_mode("plot") # Use "view" for an interactive map or "plot" for a static map
})
# Create the map
tm <- tm_shape(mpsz_sf) +
  tm_polygons(col = "grey", border.col = "black", alpha = 0.5) + # Base map with subzones
  tm_shape(childcare_sf) +
  tm_dots(col = "black", size = 0.05) + # Plot childcare locations as dots
  tm_layout(
    main.title = "Childcare Locations on Singapore Map",
    main.title.position = c("center"), # Center the title at the top
    outer.margins = c(0.1, 0, 0, 0), # Adjust outer margins to make space for the title
    legend.outside = TRUE, # Keep legend outside the map area
    legend.outside.position = "bottom" # Position the legend at the bottom
  )
tm
We can also prepare an interactive pin map by using the code chunk below.
suppressMessages({
  tmap_mode("view") # Use "view" for an interactive map or "plot" for a static map
})
tm <- tm_shape(mpsz_sf) +
  tm_polygons(col = "grey", border.col = "black", alpha = 0.5) + # Base map with subzones
  tm_shape(childcare_sf) +
  tm_dots(col = "black", size = 0.05) + # Plot childcare locations as dots
  tm_layout(
    title = "Childcare Locations on Singapore Map",
    title.position = c("center"), # Center the title at the top
    outer.margins = c(0.1, 0, 0, 0), # Adjust outer margins to make space for the title
    legend.outside = TRUE, # Keep legend outside the map area
    legend.outside.position = "bottom" # Position the legend at the bottom
  )
tm